(Alzheimer’s Association, 2019)
This project focuses on the global mortality rate of Alzheimer’s disease and other forms of dementia. The raw dataset was accessed via Our World in Data (World Health Organization, 2025) and is derived from the World Health Organization’s (WHO) Global Health Estimates (GHE) of 2023 (World Health Organization, 2023).
The WHO’s GHE reports global death and disability statistics by region, country, sex, age and cause. This dataset covers the mortality rate from Alzheimer’s disease and other dementias, in deaths per 100,000 people, from 2000 to 2021. The estimates are produced from national vital registration data, the latest estimates from WHO technical programs, United Nations partners and inter-agency groups, and the Global Burden of Disease.
Alzheimer’s disease and other dementias are rapidly growing global public health concerns, driven largely by aging populations and increasing life expectancy. Understanding how mortality rates vary across the globe allows researchers and medical professionals to examine where the burden is rising most sharply, assess the effectiveness of healthcare systems, and allocate resources more effectively.
By comparing long-term trends in the world’s three most populous countries (India, China and the United States), this analysis provides insight into how demographic transitions, healthcare infrastructure, and diagnostic practices may contribute to the global impact of dementia-related mortality.
Such comparisons are essential for guiding prevention strategies and preparing for the future demands of an aging world.
How do Alzheimer’s disease mortality rates in the world’s three most populous countries (India, China, United States of America) compare across the 2000–2021 period?
First, load the following packages. I loaded all of them explicitly because I was working on a Linux computer. On Windows or macOS you probably won’t need every one: tidyverse (which already attaches ggplot2, readr and dplyr), here and possibly plotly should be enough.
library(tidyverse)
library(here)
library(ggplot2)
library(readr)
library(scatterplot3d)
library(plotly)
library(dplyr)
library(viridis)
Next you need to import the raw data set into R and give it a more convenient name.
death_rate <- read_csv("Raw_Data/death_rate.csv", show_col_types = FALSE)
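The here package is loaded above, but the import uses a plain relative path, which only works when the working directory is the project root. A minimal sketch of the same import built with here::here() follows. It assumes the project root contains the Raw_Data folder, and data_path is a name introduced only for illustration.

```r
library(here)
library(readr)

# Build the path from the project root, so it does not depend on the
# current working directory. Folder and file names are taken from the
# import call above; data_path is a hypothetical helper name.
data_path <- here("Raw_Data", "death_rate.csv")

# Only read the file if it is actually there, and say so if it is not.
if (file.exists(data_path)) {
  death_rate <- read_csv(data_path, show_col_types = FALSE)
} else {
  message("File not found: ", data_path)
}
```

If the message appears, the project root that here() has detected is probably not the folder that holds Raw_Data.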
Below are a summary of the raw dataset, a list of its column names, and its first ten rows.
summary(death_rate)
## Entity Year
## Length:4422 Min. :2000
## Class :character 1st Qu.:2005
## Mode :character Median :2010
## Mean :2010
## 3rd Qu.:2016
## Max. :2021
## Death rate from alzheimer disease and other dementias among both sexes
## Min. : 0.000
## 1st Qu.: 4.192
## Median : 7.285
## Mean : 15.434
## 3rd Qu.: 16.270
## Max. :314.530
names(death_rate)
## [1] "Entity"
## [2] "Year"
## [3] "Death rate from alzheimer disease and other dementias among both sexes"
head(death_rate, 10)
## # A tibble: 10 × 3
## Entity Year Death rate from alzheimer disease and other dementias amo…¹
## <chr> <dbl> <dbl>
## 1 Afghanistan 2000 4.62
## 2 Afghanistan 2001 4.68
## 3 Afghanistan 2002 4.73
## 4 Afghanistan 2003 4.75
## 5 Afghanistan 2004 4.81
## 6 Afghanistan 2005 4.93
## 7 Afghanistan 2006 5.05
## 8 Afghanistan 2007 4.91
## 9 Afghanistan 2008 4.83
## 10 Afghanistan 2009 4.9
## # ℹ abbreviated name:
## # ¹`Death rate from alzheimer disease and other dementias among both sexes`
For this project, I decided to analyse only the data for the three most populous countries in the world. To reflect this in the clean data, I filtered out every other entity and kept only India, China and the US.
The original dataset consisted of 4,422 observations (consistent with 201 entities × 22 years).
# Optional check: list every entity in the dataset before filtering, to make sure the original dataset has imported correctly.
unique(death_rate$Entity)
## [1] "Afghanistan" "Africa"
## [3] "Albania" "Algeria"
## [5] "Andorra" "Angola"
## [7] "Antigua and Barbuda" "Argentina"
## [9] "Armenia" "Asia"
## [11] "Australia" "Austria"
## [13] "Azerbaijan" "Bahamas"
## [15] "Bahrain" "Bangladesh"
## [17] "Barbados" "Belarus"
## [19] "Belgium" "Belize"
## [21] "Benin" "Bhutan"
## [23] "Bolivia" "Bosnia and Herzegovina"
## [25] "Botswana" "Brazil"
## [27] "Brunei" "Bulgaria"
## [29] "Burkina Faso" "Burundi"
## [31] "Cambodia" "Cameroon"
## [33] "Canada" "Cape Verde"
## [35] "Central African Republic" "Chad"
## [37] "Chile" "China"
## [39] "Colombia" "Comoros"
## [41] "Congo" "Cook Islands"
## [43] "Costa Rica" "Cote d'Ivoire"
## [45] "Croatia" "Cuba"
## [47] "Cyprus" "Czechia"
## [49] "Democratic Republic of Congo" "Denmark"
## [51] "Djibouti" "Dominica"
## [53] "Dominican Republic" "East Timor"
## [55] "Ecuador" "Egypt"
## [57] "El Salvador" "Equatorial Guinea"
## [59] "Eritrea" "Estonia"
## [61] "Eswatini" "Ethiopia"
## [63] "Europe" "Fiji"
## [65] "Finland" "France"
## [67] "Gabon" "Gambia"
## [69] "Georgia" "Germany"
## [71] "Ghana" "Greece"
## [73] "Grenada" "Guatemala"
## [75] "Guinea" "Guinea-Bissau"
## [77] "Guyana" "Haiti"
## [79] "Honduras" "Hungary"
## [81] "Iceland" "India"
## [83] "Indonesia" "Iran"
## [85] "Iraq" "Ireland"
## [87] "Israel" "Italy"
## [89] "Jamaica" "Japan"
## [91] "Jordan" "Kazakhstan"
## [93] "Kenya" "Kiribati"
## [95] "Kuwait" "Kyrgyzstan"
## [97] "Laos" "Latvia"
## [99] "Lebanon" "Lesotho"
## [101] "Liberia" "Libya"
## [103] "Lithuania" "Luxembourg"
## [105] "Madagascar" "Malawi"
## [107] "Malaysia" "Maldives"
## [109] "Mali" "Malta"
## [111] "Marshall Islands" "Mauritania"
## [113] "Mauritius" "Mexico"
## [115] "Micronesia (country)" "Moldova"
## [117] "Monaco" "Mongolia"
## [119] "Montenegro" "Morocco"
## [121] "Mozambique" "Myanmar"
## [123] "Namibia" "Nauru"
## [125] "Nepal" "Netherlands"
## [127] "New Zealand" "Nicaragua"
## [129] "Niger" "Nigeria"
## [131] "Niue" "North America"
## [133] "North Korea" "North Macedonia"
## [135] "Norway" "Oceania"
## [137] "Oman" "Pakistan"
## [139] "Palau" "Panama"
## [141] "Papua New Guinea" "Paraguay"
## [143] "Peru" "Philippines"
## [145] "Poland" "Portugal"
## [147] "Qatar" "Romania"
## [149] "Russia" "Rwanda"
## [151] "Saint Kitts and Nevis" "Saint Lucia"
## [153] "Saint Vincent and the Grenadines" "Samoa"
## [155] "San Marino" "Sao Tome and Principe"
## [157] "Saudi Arabia" "Senegal"
## [159] "Serbia" "Seychelles"
## [161] "Sierra Leone" "Singapore"
## [163] "Slovakia" "Slovenia"
## [165] "Solomon Islands" "Somalia"
## [167] "South Africa" "South America"
## [169] "South Korea" "South Sudan"
## [171] "Spain" "Sri Lanka"
## [173] "Sudan" "Suriname"
## [175] "Sweden" "Switzerland"
## [177] "Syria" "Tajikistan"
## [179] "Tanzania" "Thailand"
## [181] "Togo" "Tonga"
## [183] "Trinidad and Tobago" "Tunisia"
## [185] "Turkey" "Turkmenistan"
## [187] "Tuvalu" "Uganda"
## [189] "Ukraine" "United Arab Emirates"
## [191] "United Kingdom" "United States"
## [193] "Uruguay" "Uzbekistan"
## [195] "Vanuatu" "Venezuela"
## [197] "Vietnam" "World"
## [199] "Yemen" "Zambia"
## [201] "Zimbabwe"
# Define the entities you want to keep. I originally added "Global" as well, but the dataset labels its worldwide aggregate "World" (see the list above), so "Global" matched no rows and is left out here.
target_countries <- c("India", "China", "United States")
# Rename the long column name (I renamed mine Mortality_rate)
death_rate <- death_rate %>%
  rename(Mortality_rate = `Death rate from alzheimer disease and other dementias among both sexes`)
# Look at the new names
names(death_rate)
## [1] "Entity" "Year" "Mortality_rate"
# Filter the death_rate dataset to include only the target countries and the years 2000 to 2021.
# The Entity column holds the country name and the Year column holds the year. Adjust these if names(death_rate) shows different names.
filtered_data <- death_rate %>%
  filter(Entity %in% target_countries) %>%
  filter(Year >= 2000 & Year <= 2021)
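A quick check of the requested names against the Entity column can catch a label that is not in the data. For example, "Global" does not appear in the entity list printed earlier, where the worldwide aggregate is listed as "World". The sketch below uses a handful of entity names copied from that list; available_entities, requested and not_found are names chosen for illustration.

```r
# Entity names copied from the unique(death_rate$Entity) output above
# (only a handful, to keep the sketch short).
available_entities <- c("China", "India", "United States", "World")

# The names requested for the filter, including the "Global" label.
requested <- c("India", "China", "United States", "Global")

# setdiff() returns the requested names that are absent from the data.
not_found <- setdiff(requested, available_entities)
print(not_found)
## [1] "Global"
```

On the full dataset the same check would be setdiff(requested, unique(death_rate$Entity)). An empty result means every requested name was found.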
After filtering for the three target countries, I applied na.omit() as a safeguard to remove any rows with missing mortality values, so that the dataset is clean and suitable for plotting and statistical interpretation. The summary of the raw data reports no missing values, and the filtered dataset keeps all 66 rows, so in practice nothing was removed.
filtered_data <- na.omit(filtered_data)
# Look at the final clean structure
head(filtered_data)
## # A tibble: 6 × 3
## Entity Year Mortality_rate
## <chr> <dbl> <dbl>
## 1 China 2000 14.7
## 2 China 2001 15.3
## 3 China 2002 16.0
## 4 China 2003 16.7
## 5 China 2004 17.6
## 6 China 2005 18.4
summary(filtered_data)
## Entity Year Mortality_rate
## Length:66 Min. :2000 Min. : 4.580
## Class :character 1st Qu.:2005 1st Qu.: 8.043
## Mode :character Median :2010 Median :22.600
## Mean :2010 Mean :30.998
## 3rd Qu.:2016 3rd Qu.:43.532
## Max. :2021 Max. :92.960
range(filtered_data$Year)
## [1] 2000 2021
Your new, clean dataset should consist of 66 observations (3 countries × 22 years).
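The row count can be checked rather than read off the summary. The sketch below counts missing values and rows per entity, using only the six China rows printed by head(filtered_data) above, so the expected count here is 6 rather than 22. The names china_rows, na_counts and rows_per_entity are chosen for illustration.

```r
# Rows copied from the head(filtered_data) output above.
china_rows <- data.frame(
  Entity = "China",
  Year = 2000:2005,
  Mortality_rate = c(14.7, 15.3, 16.0, 16.7, 17.6, 18.4)
)

# Number of missing values in each column, before anything is dropped.
na_counts <- colSums(is.na(china_rows))

# Rows per entity: in the full filtered data each country should have 22
# (one per year from 2000 to 2021); this excerpt has 6.
rows_per_entity <- table(china_rows$Entity)

# na.omit() leaves the row count unchanged when nothing is missing.
rows_before <- nrow(china_rows)
rows_after <- nrow(na.omit(china_rows))

print(na_counts)
print(rows_per_entity)
```

Run on filtered_data itself, the same three checks should show zero missing values, 22 rows for each country, and 66 rows before and after na.omit().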
Before I started to configure the visualisations, I decided to assign each target country a different colour:
ggplot(filtered_data, aes(x = Year, y = Mortality_rate, colour = Entity)) +
  geom_line(linewidth = 2) + # Thicker lines to improve overall appearance
  theme_minimal() +
  labs(
    title = "Alzheimer’s Mortality Rate (2000–2021)",
    x = "Year",
    y = "Deaths per 100,000",
    colour = "Country"
  ) +
  scale_colour_manual(values = c(
    "India" = "lightpink",
    "China" = "purple",
    "United States" = "lightblue"
  )) +
  theme(
    text = element_text(size = 14) # Larger text is easier to read for a general audience and more accessible.
  )
This line graph (Figure 1) shows the mortality rate for Alzheimer’s disease per 100,000 people between 2000 and 2021. It is simple to read: the higher the line, the higher the mortality rate. At first glance, the US has a much higher Alzheimer’s-related mortality rate than China and India. I chose this graph as a starting point because it is simple yet effective.
library(plotly)
# Interactive line plot with custom colors to match the previous graph, so as to eliminate any misunderstanding
plot_ly(filtered_data,
        x = ~Year,
        y = ~Mortality_rate,
        color = ~Entity,
        colors = c("India" = "lightpink",
                   "China" = "purple",
                   "United States" = "lightblue"),
        type = 'scatter',
        mode = 'lines+markers') %>%
  layout(
    title = "Alzheimer’s Mortality Rate (2000–2021)",
    xaxis = list(title = "Year"),
    yaxis = list(title = "Deaths per 100,000"),
    legend = list(title = list(text='Country'))
  )
I chose to expand on the original line graph and add some more detailing to it. I was really excited to try making an interactive visualisation, and I think that the interactive line graph in Figure 2 adds a sense of creativity whilst keeping the visualisation simple and easy to understand.
What becomes especially clear through this visualisation is how sharply mortality rates from Alzheimer’s and other dementias rose in the United States compared with the other two countries. While all three countries show some degree of growth over the 21-year period, the US displays a noticeably steeper upward trend.
The interactivity allows users to hover over specific points and directly observe how the US mortality rate rises more quickly and reaches higher levels than its counterparts. This not only underlines the severity of the issue for the US healthcare system and its ageing population but also highlights the widening gap between the United States and the other two countries over time.
By enabling the viewer to isolate and compare each country’s trajectory, the graph makes the US’s disproportionately high increase hard to miss, strengthening both the analysis and the overall storytelling of the data.
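To put a number on how steep each line is, the change between the first and last year can be computed per country. A minimal sketch with dplyr follows. It uses only values printed earlier in this report (Afghanistan from the raw data, China from the filtered data), so it illustrates the calculation rather than the US result, and trend_summary is a name chosen for illustration.

```r
library(dplyr)

# First and last printed values for two entities, copied from the
# head() outputs earlier in the report.
sample_rows <- data.frame(
  Entity = c("Afghanistan", "Afghanistan", "China", "China"),
  Year = c(2000, 2009, 2000, 2005),
  Mortality_rate = c(4.62, 4.90, 14.7, 18.4)
)

# For each entity, sort by year and compare the last value with the first.
trend_summary <- sample_rows %>%
  group_by(Entity) %>%
  arrange(Year, .by_group = TRUE) %>%
  summarise(
    first_year = first(Year),
    last_year = last(Year),
    absolute_change = last(Mortality_rate) - first(Mortality_rate),
    percent_change = 100 * absolute_change / first(Mortality_rate),
    .groups = "drop"
  )

print(trend_summary)
```

Replacing sample_rows with filtered_data would give the 2000 to 2021 change for India, China and the United States, which could then be quoted alongside the graph.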
p <- ggplot(filtered_data, aes(x = Year, y = Entity, fill = Mortality_rate)) +
  geom_tile(color = "white") +
  scale_fill_gradient(low = "lightblue", high = "hotpink") + # light blue (low) → hot pink (high)
  theme_minimal(base_size = 12) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
    axis.text.y = element_text(size = 10),
    legend.position = "right"
  ) +
  labs(
    title = "Heatmap of Alzheimer’s Mortality Rate (2000–2021)",
    x = "Year",
    y = "Country",
    fill = "Deaths per 100k"
  )
ggplotly(p, tooltip = c("x", "y", "fill"))
Honestly, Figure 3 was my own curiosity getting the better of me and wanting to make a more aesthetic visualisation. It is a heatmap that gives each country its own row to show how its Alzheimer’s mortality rate has increased over the 21 years. Once again, we can see that the US changed from blue to pink, showing an increase in the mortality rate, whilst India remained in the lower range, as shown by the blue.
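Since readability for a general audience matters here, one option would be a colour-blind-friendly fill scale. The sketch below swaps the two-colour gradient for ggplot2's scale_fill_viridis_c(). This is a possible alternative, not the scale used in Figure 3, and it is drawn from the six China rows printed earlier only; p_viridis is a name chosen for illustration.

```r
library(ggplot2)

# Rows copied from the head(filtered_data) output earlier in the report.
china_rows <- data.frame(
  Entity = "China",
  Year = 2000:2005,
  Mortality_rate = c(14.7, 15.3, 16.0, 16.7, 17.6, 18.4)
)

# Same tile layout as the heatmap, with a viridis fill scale.
# direction = -1 reverses the palette so that higher rates are darker.
p_viridis <- ggplot(china_rows, aes(x = Year, y = Entity, fill = Mortality_rate)) +
  geom_tile(color = "white") +
  scale_fill_viridis_c(option = "magma", direction = -1) +
  theme_minimal(base_size = 12) +
  labs(
    x = "Year",
    y = "Country",
    fill = "Deaths per 100k"
  )

print(p_viridis)
```

The viridis palettes are designed to stay distinguishable in greyscale and for common forms of colour blindness, which is the reason for suggesting them here.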
Considering at the start of this module I had no previous coding experience, I think this project has allowed me to illustrate just how much I have learnt both in and out of contact hours on the module. I have learnt how to code, first and foremost, but also how to create aesthetically pleasing visualisations that are easy to understand.
If I had more time, and honestly more patience with RStudio, I would compare the mortality rates across all the countries in the original dataset and attempt to produce some visualisations that demonstrate the differences between each country’s mortality rate. I would also like to dig deeper into why some countries have a higher mortality rate linked to Alzheimer’s than others, perhaps focusing on neural factors or on environmental influences in different cultures.
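For a comparison across all countries, the regional and worldwide aggregates in the entity list (Africa, Asia, Europe, North America, Oceania, South America and World) would first need to be set aside. A minimal sketch follows, assuming these seven are the only non-country entries. That assumption comes from reading the list printed earlier, not from the dataset's documentation, and the names aggregates and countries_only are chosen for illustration.

```r
# Entries in the entity list that are regions or the world, not countries
# (assumed from the unique(death_rate$Entity) output earlier).
aggregates <- c("Africa", "Asia", "Europe", "North America",
                "Oceania", "South America", "World")

# A short excerpt of the entity list, to keep the sketch self-contained.
entities <- c("Afghanistan", "Africa", "Albania", "Asia", "World", "Zimbabwe")

# Keep only the entries that are not aggregates.
countries_only <- entities[!entities %in% aggregates]
print(countries_only)
## [1] "Afghanistan" "Albania"     "Zimbabwe"
```

On the full dataset the equivalent step would be death_rate %>% filter(!Entity %in% aggregates), which under the same assumption would leave 194 of the 201 entities.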
(Alzheimer’s Society, 2016)
Alzheimer’s Association. (2019). Dementia vs. Alzheimer’s disease: What is the difference? Alzheimer’s Association. https://www.alz.org/alzheimers-dementia/difference-between-dementia-and-alzheimer-s
Alzheimer’s Society. (2016). Facebook. Facebook.com. https://www.facebook.com/alzheimerssocietyuk/
World Health Organization. (2023). Global Health Estimates. www.who.int. https://www.who.int/data/global-health-estimates
World Health Organization. (2025). Death rate from Alzheimer’s. Our World in Data. https://archive.ourworldindata.org/20250909-093708/grapher/death-rate-from-alzheimers-other-dementias-ghe.html?tab=table